An OLAP Requirements Example: CompSales International (part 3) - Creating Data Source Views

12/13/2010 11:15:23 AM

Creating Data Source Views

Because you will be basing your cube on a data warehouse/data mart star schema you already have available, you need to further define exactly what you need to have access to within that data source. Creating a data source view essentially allows you to look more deeply into the metadata of the data source and add additional relationships, create things like calculations, and set logical keys on the metadata of the data source. You start by right-clicking the Data Source View object in the Solution Explorer and selecting New Data Source View (or choosing Project, New Data Source View). This starts the Data Source View Wizard, which you use to define what view of data to use for the cube. The first dialog box allows you to select a data source to use as the basis of the data source view.

Figure 6 shows the data source Comp Sales2008 that you defined earlier. Chose it and click Next.

Figure 6. Identifying which data source to use for the view in the Data Source View Wizard.

If you need to limit the data source to a particular schema within the database, you can click the Advanced button and specify a schema (or schemas) to be restricted to and retrieve any foreign key and primary key relationships that may exist.

If your schema doesn’t include foreign key specifications, you can use this wizard to try to discover foreign key relationships, using a few different types of column name matching. Figure 7 shows an example of using a simple primary key column name matching technique to identify any foreign key relationships with other tables in your schema. If you have used some type of common naming convention on your source tables, you can easily leverage this name-matching dialog.

Figure 7. Logical relationship discovery, using column-naming matching in the Data Source View Wizard.

You essentially can identify the following:

Matches based on the exact column name match (as compared to the primary key column):
Order.CustomerID (foreign key) → Customer.CustomerID (primary key)
Matches based on the column name being the primary key table name:
Order.Customer → Customer.CustomerID (primary key)
Matches based on similar column name by comparing the table name concatenated with its primary key column name and then loosely comparing it to other column names of other tables:
Order.CustomerID → CustomerID (concatenated to Customer+ID=CustomerID)
Order.Customer ID → CustomerID
Or Order.Customer_ID → CustomerID

In this example, we used some good naming conventions for columns, so you can simply specify the first option (match based on exact column match). This is the lead-in to select the tables (and/or views) you need to be included from your data source. As you can see in Figure 8 , you can choose from any number of objects. You must select the base tables you need in your data source views. These are the CompSalesFactoid, Geo_Dimension, Prod_Dimension, and Time_Dimension tables. However, you should also click the Add Related Tables button to add all related tables, based on the matching technique you specified earlier. We’ve seen a little inconsistency of the wizard not adding all related tables properly. Please double check the list of tables with our figures list (Figure 8 ). This completes the set of tables that comprise the data source views for your cube.

Figure 8. Available and included objects for your data source views in the Data Source View Wizard.

You now complete this wizard by naming the data source views (Comp Sales2008 DSV) and clicking Finish.

When you exit the wizard, you end up in the designer view in Visual Studio, with a graphical representation of the data source views that will be the basis of the cube you are building (see Figure 9). This figure highlights the primary fact table (CompSalesFactoid), the primary dimension tables (Time_Dimension, Prod_Dimension, Geo_Dimension), and all tables related to these dimensions (that contain the values/descriptions of the member entries for the hierarchies of the dimensions).

Figure 9. A designer graphical representation of the data source views.

Now, because you have fully specified data source views, you can easily define a cube via the Cube Wizard. Or you can start defining your cube’s dimensions and then use these dimensions in the Cube Wizard later. Because you know your source database well, you should go ahead and create your dimensions and hierarchies first.